Abstract

We investigate the stabilization of unstable multidimensional partially observed single-sensor and 
multi-sensor linear systems driven by unbounded noise and controlled over discrete noiseless channels 
under fixed-rate information constraints. Stability is achieved under fixed-rate communication requirements
that are asymptotically tight in the limit of large sampling periods. Through the use of similarity
transforms, sampling and random-time drift conditions we obtain a coding and control policy leading 
to the existence of a unique invariant distribution and finite second moment for the sampled state. We 
use a vector stabilization scheme in which all modes of the linear system visit a compact set together 
infinitely often. We prove tight necessary and sufficient conditions for the general multi-sensor case under 
an assumption related to the Jordan form structure of such systems. In the absence of this assumption, 
we give sufficient conditions for stabilization. 
I. Introduction

A. Problem Statement 

In this paper, we consider the class of multi-sensor LTI discrete-time systems with both plant
and observation noise. The system equations are given by

$$x_{t+1} = A x_t + B u_t + w_t, \qquad y_t^j = C^j x_t + v_t^j, \qquad 1 \le j \le M, \tag{1}$$

where $x_t \in \mathbb{R}^n$ and $u_t \in \mathbb{R}^m$ are the state and control action variables at time $t \in \mathbb{N}$ respectively.
The observation made by sensor $j$ at time $t$ is denoted by $y_t^j \in \mathbb{R}^{p_j}$. The matrices $A$, $B$, $C^j$ and
random vectors $w_t$, $v_t^j$ are of compatible size. The initial state, $x_0$, is drawn from a Gaussian
distribution.

Assumption 1.1: The noise processes $\{w_t\}$ and $\{v_t^j\}$ are each i.i.d. sequences of multivariate
Gaussian random vectors with zero mean. At time $t$, both $w_t$ and $v_t^j$ are independent of $x_t$ and
each other.

Assumption 1.2: We require controllability and joint observability. That is, the pair $(A, B)$
is controllable and the pair $([(C^1)^T \cdots (C^M)^T]^T, A)$ is observable but the individual pairs
$(C^j, A)$ may not be observable.

The setup is depicted in Figure 1. The observations are made by a set of $M$ sensors and each
sensor sends information to the controller through a finite capacity channel. At each time stage $t$,
we allow sensor $j \in \{1, \dots, M\}$ to send an encoded value $q_t^j \in \{1, 2, \dots, N_t^j\}$ for some $N_t^j \in \mathbb{N}$.
In addition, the controller can send a feedback value $b_t \in \{0, 1\}$ at times $t = Ts$, where $T$ is
the period of our coding policy and $s \in \mathbb{N}$. The value $b_t$ is seen by all sensors at time $t$. We
define the rate at time $t$ as $R_t = \sum_{j=1}^{M} \log_2(N_t^j)$. The coding scheme is applied periodically with
period $T$ and so the rate for all time stages is specified by $\{N_0^j, \dots, N_{T-1}^j : 1 \le j \le M\}$. The
average rate is

$$R_{\mathrm{avg}} = \frac{1}{T}\left(M + \sum_{t=0}^{T-1} R_t\right), \tag{2}$$

accounting for the encoded and feedback values.

Information structure. For a process $\{x_t\}$ we define $x_{[a,b]} = \{x_a, x_{a+1}, \dots, x_b\}$. At time
$t$, each sensor $j$ maps its information $\mathcal{I}_t^j := \{y^j_{[0,t]}, b_{[0,t]}\} \to q_t^j \in \{1, \dots, N_t^j\}$. The controller
maps its information $\mathcal{I}_t^c := \{q^1_{[0,t]}, \dots, q^M_{[0,t]}\} \to u_t \in \mathbb{R}^m$.
Fig. 1: A multi-sensor system with finite-rate communication channels.



B. Notation 



We denote the indicator function of an event $E$ by $1_E$. We will use $\mathbb{R}^{m \times n}$ to denote the space
of real $m \times n$ matrices and $\mathbb{R}^n$ to denote the space of real $n$-dimensional vectors. We let $\mathbb{R}^n_+$ be
the space of real $n$-dimensional vectors with all entries nonnegative. Unless otherwise stated, all
vectors are assumed to be column vectors. For any $x \in \mathbb{R}^n$ we write $x = [x^1 \cdots x^n]^T$ where
$x^i \in \mathbb{R}$ is the $i$-th entry. We define the absolute value operation for vectors as the component-wise
absolute value. That is, $|x| = [|x^1| \cdots |x^n|]^T$. For a matrix $A \in \mathbb{R}^{m \times n}$ we denote its
transpose by $A^T$ and determinant by $\det(A)$. If it is invertible, we denote the inverse by $A^{-1}$.
We let $\Lambda(A)$ denote the set of eigenvalues of $A$. The $\ell_p$ norm is denoted by $\|\cdot\|_p$ and defined
as $\|x\|_p = \left(\sum_{i=1}^{n} |x^i|^p\right)^{1/p}$.

Definition 1.3: For $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n_+$ we write $x \le y$ if $|x^i| \le y^i$ for all $1 \le i \le n$. We
write $x \not\le y$ otherwise.

The observability matrix of sensor $j$ is

$$\mathcal{O}_{(C^j, A)} = \left[ (C^j)^T \;\; (C^j A)^T \;\; \cdots \;\; (C^j A^{n-1})^T \right]^T,$$

the null space is $N^j = \mathrm{Ker}(\mathcal{O}_{(C^j, A)})$ and the observable subspace is defined to be $O^j = (N^j)^{\perp}$ for
$1 \le j \le M$.
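The per-sensor observability matrix and its null space can be checked numerically. The sketch below is illustrative only: it assumes NumPy is available, and the matrices `A`, `C1`, `C2` and the helper names are hypothetical, chosen so that neither sensor is observable on its own while the stacked pair is (the situation allowed by Assumption 1.2).

```python
import numpy as np

def observability_matrix(C, A):
    """Stack C, CA, ..., CA^(n-1) row-wise."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

def unobservable_dim(C, A, tol=1e-9):
    """Dimension of Ker(O_(C,A)), i.e. of the null space N^j."""
    rank = np.linalg.matrix_rank(observability_matrix(C, A), tol=tol)
    return int(A.shape[0] - rank)

# Hypothetical two-sensor example: each sensor sees one unstable mode only.
A = np.diag([2.0, 3.0])
C1 = np.array([[1.0, 0.0]])
C2 = np.array([[0.0, 1.0]])
C_joint = np.vstack([C1, C2])  # stacked [(C^1)^T (C^2)^T]^T

dims = (unobservable_dim(C1, A),
        unobservable_dim(C2, A),
        unobservable_dim(C_joint, A))
```

Here each individual null space is one-dimensional, while the joint pair has a trivial null space.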
C. Brief Literature Review

Due to space limitations, we are unable to give a fair account of the literature. We refer the
reader to the book [1] for a thorough review of the networked control literature and to [2] and [3]
for a general overview of some of the related results.

There has been an extensive study in networked control theory regarding quantizer design
for both stabilization and optimization. References including [1] and [6] obtained a lower bound on
the average rate of the information transmission for the finiteness of second moments. For the
system (1), letting $\{\lambda_i\}$ be the set of eigenvalues of $A$, this bound is $R_{\mathrm{avg}} \ge R_{\min}$ where

$$R_{\min} = \sum_{|\lambda_i| > 1} \log_2(|\lambda_i|). \tag{3}$$


Various publications have studied the characterization of minimum information requirements 
for multi-sensor and multi-controller linear systems with an arbitrary topology of decentralization 
and the fundamental bounds have been extensively studied in d, BID. ED. ESI. ED. G2I. 

m, eh, ma and m. 

When a linear system is driven by unbounded noise, the analysis is particularly difficult since
the bounded quantizer range leads to a transient state process (see Proposition 5.1 in [5] and
Theorem 4.2 in ifPTlO). For such a noisy setup, a stability result of the form $\limsup_{t \to \infty} E[\|x_t\|^2] < \infty$
was given in [5] for noisy systems with unbounded support, using a variable-rate
quantizer. Under this scheme, the quantizer is applied with a very high rate during some time
intervals. More recently, a fixed-rate scheme was presented in for a scalar noisy system 
using martingale theory, which achieved the lower bound plus an additional symbol required for 
encoding. The existence of an invariant distribution was established under the coding and control
policy presented, along with a finite second moment of the state. That is, $\lim_{t \to \infty} E[\|x_t\|^2] < \infty$.
Reference [18] considered a general random-time stochastic drift criterion for Markov chains and applied it
to binary erasure channels in a similar spirit.

D. Contributions 

In view of the literature, the contributions of this work are as follows: 
• The case where the system is multi-dimensional and driven by unbounded noise over a
discrete channel has not been studied to our knowledge, regarding the existence of an
invariant distribution and ergodicity properties. Results for the limit properties of the finite
moment are also new.

• We give sufficient conditions for multi-sensor systems with both system noise and observation
noise with unbounded support, which has not been treated previously, to our knowledge.

Our approach builds on the martingale and the random-drift programs considered in [2] and [18];
however, new geometric constructions are needed for the vector and partially observed settings.
We define a more general class of stopping times and adopt a further geometric approach.

We structure the paper as follows. In Section II, we study single-sensor systems and give our
main result for such systems, Theorem 2.3. Section II-D outlines the proof of Theorem 2.3.
The more detailed proofs can be found in Section V-A. In Section III, we study multi-sensor
systems and give our main result for such systems, Theorem 3.4. A supporting proof can be
found in Section V-B. Some basic definitions and results from the theory of matrix algebra,
Markov chains and stochastic stabilization are provided in Section V-C.

II. Single-Sensor Systems 

A. Problem Statement 

Consider the class of single-sensor LTI discrete-time systems with both plant and observation 
noise. The system equations are given by 

$$x_{t+1} = A x_t + B u_t + w_t, \qquad y_t = C x_t + v_t, \tag{4}$$

where $x_t \in \mathbb{R}^n$, $u_t \in \mathbb{R}^m$ and $y_t \in \mathbb{R}^p$ are the state, control action and observation at time $t$
respectively. The matrices $A$, $B$, $C$ and the noise vectors $w_t$, $v_t$ are of compatible size. The initial
state, $x_0$, is drawn from a Gaussian distribution. We label the eigenvalues of $A$ as $\lambda_1, \dots, \lambda_n$.
Without loss, we assume that $A$ is in real Jordan normal form and that $|\lambda_i| > 1$ for all $1 \le i \le n$.

Assumption 2.1: The noise processes $\{w_t\}$ and $\{v_t\}$ are each i.i.d. sequences of multivariate
Gaussian random vectors with zero mean. At time $t$, both $w_t$ and $v_t$ are independent of $x_t$ and
each other.

Assumption 2.2: The pair (A, B) is controllable and the pair (C, A) is observable. 

The setup is depicted in Figure 2. The observations are made by the sensor and sent to the
controller through a finite capacity channel. At each time stage $t$, we allow the sensor to send
an encoded value $q_t \in \{1, \dots, N_t\}$ for some $N_t \in \mathbb{N}$. We define the rate of our system at time $t$
as $R_t = \log_2(N_t)$. Now, suppose that the channel is used periodically, every $T$ time stages. The
rate for all time stages is then specified by $\{N_0, \dots, N_{T-1}\}$. The average rate is

$$R_{\mathrm{avg}} = \frac{1}{T} \sum_{t=0}^{T-1} R_t. \tag{5}$$
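As a small numerical illustration of the average rate over one period, the following sketch computes $\frac{1}{T}\sum_t \log_2(N_t)$ from a list of alphabet sizes. The function name and the example period are hypothetical; a time stage at which nothing is sent is modelled here by $N_t = 1$ (zero bits), which is an assumption of the sketch rather than a convention stated in the text.

```python
import math

def average_rate(alphabet_sizes):
    """Average rate (1/T) * sum_t log2(N_t) over one period of length T."""
    T = len(alphabet_sizes)
    return sum(math.log2(N) for N in alphabet_sizes) / T

# Hypothetical period T = 4: the channel carries one of 17 symbols at the
# last time stage and is silent (N_t = 1) at the other three stages.
period = [1, 1, 1, 17]
r_avg = average_rate(period)
```

With these numbers the average rate is $\log_2(17)/4$, a little over one bit per time stage.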
Fig. 2: A single-sensor system with finite-rate communication channel.

Information structure. At time $t$, the sensor maps its information $\mathcal{I}_t^s := \{y_{[0,t]}\} \to q_t \in \{1, \dots, N_t\}$.
The controller maps its information $\mathcal{I}_t^c := \{q_{[0,t]}\} \to u_t \in \mathbb{R}^m$.

B. Main Result

Our main result for single-sensor systems is the following: 

Theorem 2.3: There exists a coding and control policy with average rate
$R_{\mathrm{avg}} \le \frac{1}{T2n} \sum_{i=1}^{n} \log_2\left(\lceil |\lambda_i|^{T2n} + \epsilon \rceil + 1\right)$ for some $\epsilon > 0$ which gives:

(a) the existence of a unique invariant distribution for $\{x_{2nt}\}$;

(b) $\lim_{t \to \infty} E[\|x_{2nt}\|_2^2] < \infty$.

Theorem 2.4: The average rate in Theorem 2.3 achieves the minimum rate (3) asymptotically
for large sampling periods. That is, $\lim_{T \to \infty} R_{\mathrm{avg}} = R_{\min}$.
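The asymptotic tightness claimed in Theorem 2.4 can be illustrated numerically by evaluating the rate bound of Theorem 2.3 for growing $T$ and comparing it with $\sum_i \log_2 |\lambda_i|$. This is only a sketch: the eigenvalues, the value `eps = 0.5` and the function names are hypothetical choices, and all eigenvalues are taken with magnitude greater than one, as assumed in Section II-A.

```python
import math

def rate_upper_bound(eigs, T, eps=0.5):
    """(1/(T*2n)) * sum_i log2(ceil(|lambda_i|^(T*2n) + eps) + 1)."""
    n = len(eigs)
    period = T * 2 * n
    total = sum(math.log2(math.ceil(abs(lam) ** period + eps) + 1)
                for lam in eigs)
    return total / period

def r_min(eigs):
    """Sum of log2|lambda_i| over the unstable eigenvalues."""
    return sum(math.log2(abs(lam)) for lam in eigs if abs(lam) > 1)

eigs = [1.5, 2.0]  # hypothetical unstable eigenvalues
gaps = [rate_upper_bound(eigs, T) - r_min(eigs) for T in (1, 2, 4, 8)]
```

For this example the gap between the bound and the minimum rate stays positive and shrinks as $T$ grows.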



C. Coding and Control Policy 

For now, assume that $A$ has only one eigenvalue $\lambda$. We will see later how this assumption
can be made without loss.

Put $K = \lceil |\lambda| + \epsilon \rceil$ for some parameter $\epsilon > 0$ and consider the following scalar $(K+1)$-bin
uniform quantizer. Assuming that $K$ is even, this is defined for $k \in \{1, 2, \dots, K\}$ as

$$Q_K^{\Delta}(x) = \begin{cases} \left(-\frac{K+1}{2} + k\right)\Delta, & \text{if } x \in \left[\left(-\frac{K}{2} + k - 1\right)\Delta, \left(-\frac{K}{2} + k\right)\Delta\right), \\ \frac{K-1}{2}\Delta, & \text{if } x = \frac{K}{2}\Delta, \\ 0, & \text{if } |x| > \frac{K}{2}\Delta, \end{cases}$$

where $\Delta \in \mathbb{R}_+$ is the bin size. The set $[-\frac{K}{2}\Delta, \frac{K}{2}\Delta]$ is called the granular region while the set
$(-\infty, -\frac{K}{2}\Delta) \cup (\frac{K}{2}\Delta, \infty)$ is called the overflow region. If the state is in the granular region,
that is if $|x| \le \frac{K}{2}\Delta$, then we say the quantizer is perfectly-zoomed. Otherwise, we say it is
under-zoomed.
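A minimal sketch of the scalar quantizer just described, assuming $K$ is even and using floating-point arithmetic (so bin boundaries are only exact for inputs representable in binary). The function name `quantize` is hypothetical.

```python
import math

def quantize(x, K, delta):
    """Scalar (K+1)-bin uniform quantizer with bin size delta (K even)."""
    half = K / 2 * delta
    if abs(x) > half:
        return 0.0                          # overflow region
    if x == half:
        return (K - 1) / 2 * delta          # right endpoint joins the last bin
    k = math.floor(x / delta + K / 2) + 1   # bin index in {1, ..., K}
    return (k - (K + 1) / 2) * delta        # midpoint of bin k

# In the granular region the reconstruction error is at most delta / 2.
errors = [abs(quantize(x, 4, 1.0) - x) for x in (-1.9, -0.2, 0.0, 0.4, 1.99)]
```

The half-bin error bound in the granular region is the reason the perfectly-zoomed condition is useful: it lets the controller bound the estimation error by the current bin size.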

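We write our quantizer as the composite function $Q_K^{\Delta}(x) = \mathcal{D}_K^{\Delta}(\mathcal{E}_K^{\Delta}(x))$. The encoder $\mathcal{E}_K^{\Delta} : \mathbb{R} \to \{1, \dots, K+1\}$
and decoder $\mathcal{D}_K^{\Delta} : \{1, \dots, K+1\} \to \mathbb{R}$ for $k \in \{1, 2, \dots, K\}$ are

$$\mathcal{E}_K^{\Delta}(x) = \begin{cases} k, & \text{if } x \in \left[\left(-\frac{K}{2} + k - 1\right)\Delta, \left(-\frac{K}{2} + k\right)\Delta\right), \\ K, & \text{if } x = \frac{K}{2}\Delta, \\ K+1, & \text{if } |x| > \frac{K}{2}\Delta, \end{cases}$$

$$\mathcal{D}_K^{\Delta}(x) = \begin{cases} \left(-\frac{K+1}{2} + x\right)\Delta, & \text{if } x \ne K+1, \\ 0, & \text{if } x = K+1. \end{cases}$$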



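At time $t$, we associate with each component $x_t^i$ a bin size $\Delta_t^i$. Let $q_t^i = \mathcal{E}_K^{\Delta_t^i}(y_t^i)$. We will be
applying our control policy to system (9) where $y_s$ is a meaningful estimate of the state $x_s$. Let
our fixed rate be $N_t = K^n + 1$ for all $t \in \mathbb{N}$. Choose any invertible function $f : \{1, \dots, K\}^n \to \{1, \dots, K^n\}$.
We then choose the encoded value

$$q_t = \begin{cases} f(q_t^1, \dots, q_t^n), & \text{if } q_t^i \ne 0 \text{ for all } 1 \le i \le n, \\ 0, & \text{otherwise.} \end{cases}$$

Upon receiving $q_t \ne 0$, the controller knows $q_t^1, \dots, q_t^n$. The controller forms the estimate
$\hat{x}_t = [\hat{x}_t^1 \cdots \hat{x}_t^n]^T$ where

$$\hat{x}_t^i = \begin{cases} \mathcal{D}_K^{\Delta_t^i}(q_t^i), & \text{if } q_t \ne 0, \\ 0, & \text{otherwise.} \end{cases}$$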
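Any invertible map $f$ will do for combining the per-component bin indices into one channel symbol. One possible choice, sketched below, is a base-$K$ positional code; this particular $f$ and the names `pack`, `unpack` and `encode_vector` are assumptions made for illustration, and an overflowing component is marked by `None` in this sketch.

```python
def pack(indices, K):
    """One invertible map f: {1..K}^n -> {1..K^n} (base-K positional code)."""
    code = 0
    for q in indices:
        code = code * K + (q - 1)
    return code + 1

def unpack(code, K, n):
    """Inverse of pack: recover the n bin indices from the joint symbol."""
    code -= 1
    digits = []
    for _ in range(n):
        code, d = divmod(code, K)
        digits.append(d + 1)
    return digits[::-1]

def encode_vector(bin_indices, K):
    """Joint symbol: 0 if any component overflowed (None), else f(q^1..q^n)."""
    if any(q is None for q in bin_indices):
        return 0
    return pack(bin_indices, K)
```

Together with the overflow symbol 0 this uses exactly $K^n + 1$ channel symbols, matching the fixed rate $N_t = K^n + 1$.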
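We assume without loss that $A$ is a Jordan block with eigenvalue $\lambda$. From the real Jordan
canonical form (see for example [19]), we know that it can be written as

$$A = \begin{bmatrix} \lambda & 1 & & \\ & \ddots & \ddots & \\ & & \lambda & 1 \\ & & & \lambda \end{bmatrix} \text{ if } \lambda \in \mathbb{R}, \qquad A = \begin{bmatrix} D & I & & \\ & \ddots & \ddots & \\ & & D & I \\ & & & D \end{bmatrix} \text{ if } \lambda \in \mathbb{C},$$

where in the complex case we write $\lambda = a + ib$ for some $a, b \in \mathbb{R}$ and define

$$D = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}, \qquad I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$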
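The two block shapes can be built programmatically. The following is a sketch assuming NumPy; the function name `real_jordan_block` and its `size` argument (the number of diagonal cells) are hypothetical.

```python
import numpy as np

def real_jordan_block(lam, size):
    """Real Jordan block with `size` diagonal cells for eigenvalue `lam`."""
    lam = complex(lam)
    if lam.imag == 0:
        cell = np.array([[lam.real]])        # 1x1 cell for a real eigenvalue
    else:
        a, b = lam.real, lam.imag
        cell = np.array([[a, b], [-b, a]])   # 2x2 cell D for lam = a + ib
    d = cell.shape[0]
    J = np.kron(np.eye(size), cell)          # cells on the block diagonal
    J += np.kron(np.eye(size, k=1), np.eye(d))  # identities above the diagonal
    return J

J_real = real_jordan_block(2.0, 3)
J_cplx = real_jordan_block(1 + 2j, 2)
```

In the complex case each cell is $2 \times 2$, so a block with two cells is a $4 \times 4$ matrix, which is why only even $n$ arises there.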



The update equations are 

At+i = Q (q t , A t ) A t , Q (g t , A t 
for some p > 1 and with 




if Qt = 0, 
otherwise, 



(6) 



/9(A t ) = diag(ft(A 



if A* < U 



otherwise, 



(7) 



|A|+e-»j' 

for some $0 < \eta < \epsilon$ and $L \in \mathbb{R}^n_+$. Note that if we define $L' = L|\lambda|/(|\lambda| + \epsilon - \eta)$ then $\Delta_t^i \ge (L')^i$
for all $1 \le i \le n$ and all $t \in \mathbb{N}$.

Bin ordering. We set $L = c\Delta_0$, for some $0 < c < 1$. First let $\lambda \in \mathbb{R}$. For any $\delta > 0$ we can
choose $\Delta_0^i$ and $\Delta_0^{i+1}$ such that $\Delta_0^{i+1} \le \delta \Delta_0^i$ for all $1 \le i \le n-1$. With our update equations and
our choice of $L$ we get that the ordering is preserved over all time stages. That is, $\Delta_t^{i+1} \le \delta \Delta_t^i$
for all $1 \le i \le n-1$ and $t \in \mathbb{N}$.

Now let $\lambda \in \mathbb{C}$. We choose $\Delta_0^i = \Delta_0^{i+1}$ for all $i$ odd. Thus, we have divided the complex
modes into their conjugate pairs and set their initial bin sizes to be equal. Our initial condition
implies that $\Delta_t^i = \Delta_t^{i+1}$ for all $i$ odd and $t \in \mathbb{N}$. For any $\delta > 0$ we can choose $\Delta_0^i$ and $\Delta_0^{i+2}$
such that $\Delta_t^{i+2} \le \delta \Delta_t^i$ for all $1 \le i \le n-2$ and $t \in \mathbb{N}$.

Under our information structure, the update equations (6) can be applied at the sensor and
the controller. Our vector quantizer is implementable and at time $t$ the controller knows $\hat{x}_t$. We
choose the control action $u_t = -A\hat{x}_t$.
D. Outline of Proof for Theorem 2.3

In this section, we outline the supporting results and key steps in proving our main result for
single-sensor systems, Theorem 2.3.

Lemma 2.5: We can sample every $2n$ time stages and apply a similarity transform to $x_t$ in
(4) to obtain $\bar{x}_s = P x_{2ns}$ with $s \in \mathbb{N}$ for some invertible matrix $P$. This new state satisfies the
following system of equations:

$$\bar{x}_{s+1} = \bar{A}\bar{x}_s + \bar{u}_s + \bar{w}_s, \qquad \bar{y}_s = \bar{x}_s + \bar{v}_s. \tag{8}$$

The control action $\bar{u}_s \in \mathbb{R}^n$ is chosen arbitrarily by the controller and the elimination of the $B$
matrix can be justified by sampling. The estimate $\bar{y}_s \in \mathbb{R}^n$ at time $s$ is known by the sensor.
The noise processes $\{\bar{w}_s\}$ and $\{\bar{v}_s\}$ are each i.i.d. sequences of zero mean multivariate Gaussian
random vectors. At time $s$, $\bar{w}_s$ and $\bar{v}_s$ are independent of $\bar{x}_s$ but may be correlated with each other.
For $s_1 \ne s_2$, the vectors $\bar{w}_{s_1}$ and $\bar{v}_{s_2}$ are independent. The matrix $\bar{A}$ is in real Jordan normal
form and has eigenvalues $\lambda_1^{2n}, \dots, \lambda_n^{2n}$.

By a slight abuse of notation, we will rewrite system (8) as

$$x_{s+1} = A x_s + u_s + w_s, \qquad y_s = x_s + v_s, \tag{9}$$

where $x_s \in \mathbb{R}^n$, $u_s \in \mathbb{R}^n$ and $y_s \in \mathbb{R}^n$ are the state, control action and observation at time $s$
respectively.

Remark 2.6: We consider the case where $A$ is a single Jordan block with eigenvalue $\lambda$. We
can do this without loss since we are considering the single-sensor case and the sensor obtains
an estimate for all components, as seen in Lemma 2.5. Thus, we can simply apply our control
policy to each Jordan block. In all remaining theorems of this section, we will work with system
(9). Where necessary, we will distinguish between the real and complex eigenvalue cases.

Lemma 2.7: The process $\{(x_s, \Delta_s)\}$ is Markov.

Section II-C gives our control policy in terms of the parameters $\rho$, $\epsilon$ and $\eta$.

Lemma 2.8: For appropriate choices of $\rho$, $\epsilon$ and $\eta$, we can form a countable state space $\mathbb{S}$
for $\{\Delta_s\}$. The process $\{(x_s, \Delta_s)\}$ is an irreducible Markov chain on $\mathbb{R}^n \times \mathbb{S}$.

Define the sequence of stopping times

$$\tau_0 = 0, \qquad \tau_{z+1} = \min\left\{ s > \tau_z : |y_s| = |x_s + v_s| \le \frac{K}{2}\Delta_s \right\}.$$

These are the times when all quantizers are perfectly-zoomed. We assume that this is satisfied at
time $s = 0$. This technical condition is justified by showing that the process $\{(x_s, \Delta_s)\}$ moves
to such a perfectly-zoomed state in a finite time, which is dominated by a geometric distribution
(see a similar discussion in [18]).

Theorem 2.9: If $K$ is even then the following hold.

(a) For any $r > 0$ and any polynomial of finite degree $Q(k)$ there exists a sufficiently large $H$
such that $Q(k) P(\tau_{z+1} - \tau_z \ge k \mid x_{\tau_z}, \Delta_{\tau_z}) \le r^{-k}$ for all $k \ge H$ and for all $z \in \mathbb{N}$.

(b) Let $\Delta_{\tau_z} \to \infty$ be equivalent to stating that $\Delta_{\tau_z}^i \to \infty$ for all $1 \le i \le n$. Then

$$\lim_{\Delta_{\tau_z} \to \infty} P(\tau_{z+1} - \tau_z > 1 \mid x_{\tau_z}, \Delta_{\tau_z}) = 0$$

uniformly in $x_{\tau_z}$.
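We define the compact sets

$$S = S_x \times S_\Delta, \qquad S_\Delta = \{\Delta \in \mathbb{S} : \Delta^i \le F, \; 1 \le i \le n\}, \qquad S_x = \left\{x \in \mathbb{R}^n : |x^i| \le \frac{K}{2}F, \; 1 \le i \le n\right\}$$

for some $F > L^1$ where $L^1$ is a component of $L$ as described in Section II-C. Note that at the
stopping time $\tau_z$, if $\Delta_{\tau_z} \in S_\Delta$ then $|x_{\tau_z}^i| \le \frac{K}{2}\Delta_{\tau_z}^i \le \frac{K}{2}F$ for all $1 \le i \le n$, and thus $x_{\tau_z} \in S_x$
and $(x_{\tau_z}, \Delta_{\tau_z}) \in S$.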

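Lemma 2.10: For some $\gamma > 0$, the following drift condition holds:

$$\gamma E\left[\sum_{s=\tau_z}^{\tau_{z+1}-1} (\Delta_s^1)^2 \,\middle|\, x_{\tau_z}, \Delta_{\tau_z}\right] \le (\Delta_{\tau_z}^1)^2 - E\left[(\Delta_{\tau_{z+1}}^1)^2 \,\middle|\, x_{\tau_z}, \Delta_{\tau_z}\right] + b\,1_{\{(x_{\tau_z}, \Delta_{\tau_z}) \in S\}}. \tag{10}$$

For $\lambda \in \mathbb{C}$, the above also holds with $\Delta^2$ in place of $\Delta^1$.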

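For $x \in \mathbb{R}^n$, we say that $x^i$ and $x^{i+1}$ are a conjugate pair if $i$ is odd. To simplify notation in
the complex eigenvalue case we find it convenient to define for any $x \in \mathbb{R}^n$, the set of vectors

$$\check{x}^i = \begin{cases} [x^i \;\; x^{i+1}]^T, & \text{if } i \text{ is odd}, \\ [x^{i-1} \;\; x^i]^T, & \text{if } i \text{ is even}, \end{cases}$$

for $1 \le i \le n$. Note that $\check{x}^i = \check{x}^{i+1}$ for $i$ odd. We are only concerned with the case when $n$ is
even.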

Theorem 2.11: Let A G M. For i = n, there exists a k > such that 

1 \2 



E 



X 



i\2 



X T , A T 

>z * >z 



(11) 
If $\lim_{s \to \infty} E[(x_s^k)^2] < \infty$ then the above holds for $i = k - 1$.
For $\lambda \in \mathbb{C}$, with $i = n - 1$, there exists a $\kappa > 0$ such that



E 



Tz + l-l 



1 fA 1 . 

t z / t z 



If $\lim_{s \to \infty} E[(\check{x}_s^k)^T \check{x}_s^k] < \infty$ then the above holds for $i = k - 2$.

Proof of Theorem 2.3:

(a) We know from Lemmas 2.7 and 2.8 that the process $\{(x_s, \Delta_s)\}$ is an irreducible Markov
chain. The set $S$ is small (see Section V-C and [18]). Using Lemma 2.10, we can apply
Theorem 5.8 with $a = 1$, the irreducible Markov chain $\{(x_s, \Delta_s)\}$ and the functions
$V(x_s, \Delta_s) = (\Delta_s^1)^2$, $\beta(x_s, \Delta_s) = 1$ and $b$ as given in Lemma 2.10 to get that $\{(x_s, \Delta_s)\}$ is
positive Harris recurrent and has a unique invariant distribution.



(b) Suppose that A G 1. We will apply Theorem |5.8| with a = 0, the irreducible Markov chain 

{(x s , A s )} and the functions V{x„ A s ) = (A, 1 ) 2 , /?(x s , A s ) = -f{Al) 2 , /(x„ A 
we get 



From Lemma 2.10 



E[V(x 
<(A 



A 

T z +1 5 "T z + 1 , 



X r z , A Tz ] 



1 )2 



7^; 



T a +1 — 1 



S=T Z 



61 



{(x Tz ,A Tz )eS} 



<(Kf-l(Kf + bl {(xTz ,A Tz) es } 

= V(*t z ,A Tz ) -/3(x Tz ,A T J + 61 { ( Xrz ,A Tz )e5}- 



We know that Theorem 2.11 holds immediately for {x"} and thus 

"Ti+l— 1 





ft 


"t z +i-1 






£ «) 2 


X r z , A Tz 






. S=T Z 





< 7 (A 



i )2 

T Z / 



/9(x rz ,A 7 



where we have used the ordering of bin sizes as described in Section II-C 

Thus, lim^oo ^E[(x^) 2 ] < oo by Theorem 5.8 and so Hindoo E{(x™) 2 ] < oo. This implies 



that Theorem 2.11 holds for {x™ -1 } as mentioned in the proof and theorem statement. The 
finite second moment of all components then follows by induction. 

In the complex case, we have that the drift condition ( [10] ) in Lemma 2.10| also holds with 
A 2 in place of A] since they are equal. Choosing the functions V(x a , A s ) = (A]) 2 + (A 2 ) 2 , 



/3(x s ,A s ) = 7 ((Ai) 2 + (A 2 ) 2 ), /(x,,A 



2l 'x™) T x™, we obtain the result. 



□ 
III. Multi-Sensor Systems

A. Problem Statement

This is the main problem of the paper and is stated in Section I-A.

B. Main Result

To state the main result of this section, we first present a known result and an assumption.

The following theorem extends the classical observability canonical decomposition to the
decentralized case. For a detailed proof in the centralized case, see [20]. The more general
multi-agent setup, where each agent makes observations and applies a control action, can be
found in [21]. We are not aware of an explicit proof and give a proof of Theorem 3.1 in Section
V-B for the convenience of the reader.



Theorem 3.1: Under Assumption 1.2, there exists a matrix Q such that if we define A 



r 1 and C j = 


= CV'Q 


~ l then 








A M 


* 


* 


A = 




A M -i • • 


* 













- QM - 






* 


* 










* 


. c 1 












(12a) 



(12b) 



where the *'s denote irrelevant submatrices, each A 7 G M. njXnj and each C 3 Q e 



Remark 3.2: In the proof of Theorem 3.1 we give one construction for the triangular
decomposition in (12). This transformation is not unique. There may be many ways to achieve a
block upper triangular form and it is not necessary to place the sensors in order $M, \dots, 1$.

Let us label the Jordan blocks of $A$ as $J_1, \dots, J_\ell$. Let $V_i$ be the (possibly generalized)
eigenspace corresponding to $J_i$. That is, if $v_{i,1}, \dots, v_{i,d_i}$ are the (possibly generalized)
eigenvectors associated with $J_i$ then $V_i = \mathrm{span}\{v_{i,1}, \dots, v_{i,d_i}\}$ and has dimension $d_i$.

Assumption 3.3: Each eigenspace is observed by some sensor. That is, for each $1 \le i \le \ell$
there exists a $1 \le j \le M$ such that $V_i \subset O^j$.

The following is the main result of this section:

Theorem 3.4: Under Assumption 3.3, there exists a coding and control policy with average
rate $R_{\mathrm{avg}} \le \frac{1}{T2n}\left(M + \sum_{i=1}^{n} \log_2\left(\lceil |\lambda_i|^{T2n} + \epsilon \rceil + 1\right)\right)$ for some $\epsilon > 0$ which gives:

(a) the existence of a unique invariant distribution for $\{x_{2nt}\}$;

(b) $\lim_{t \to \infty} E[\|x_{2nt}\|_2^2] < \infty$.

Theorem 3.5: The average rate in Theorem 3.4 achieves the minimum rate (3) asymptotically
for large sampling periods. That is, $\lim_{T \to \infty} R_{\mathrm{avg}} = R_{\min}$.

Proof of Theorem 3.5: Follows from the proof of Theorem 2.4. □



Proof of Theorem 3.4: Under Assumption 3.3, we can assign each eigenspace Vi C O 3 to 
some sensor j. Let Vj ; i, . . . , Vj >mj denote the eigenspaces assigned to sensor j and let us write 



'3 

V jti = span{v jii) i, . . . . v J where each G 

T 



snxl 



Q 



J 



and Q 



We put Qj ti 

T 

(Q nT 



0'.*4 



T 



(Q M ) T ■■ 

Each Vj j /j belongs to the generalized eigenspace Vj t i, which is invariant under multiplication 
by A. That is 



(13) 



We apply the similarity transform x 4 = Qx t to ([T]) and define A = QAQ 1 , B = QB and 
w t = Qw t to get the system 



x f+ i = Ax( + Bu ( + w t . 
Furthermore, we can write A = diag(A^, 



, Ai) where A, G 



(14) 

% i and rij = Ya=\ is 



the sum of the dimensions of V^i, . . . , Vj )Tnj . Equation ( |13| ) is analogous to ( |30| ) in the proof of 



Theorem 3.1 and from this proof, we obtain the desired diagonal form. 



We now look at the estimation of the state by the sensors. For convenience, let us write x t 



where x^ 



with x 



j,i,h 



I. Let us write O 



(,d,A) 



and x. J t 

T 



J' 1 



o 



3,npj , 



where each G 



-j,i,dj,i 



)lxn 



With our construction above, under Assumption 3.3, we have for each j,i,h that Vj^h = 
Y^i=i kf l ' h °j/ ^ or some rea l coefficients {kf l ' h }. Consider the first n time stages. By putting 
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n npj 



, it follows that 

T 



T 



Ei iAth^ i —j,i,h i —j.i.h 

kf Oj/xo + v J " = Vj- j^xo + v J 



X 



j.i.h i —j.i.h 
+ f n" 



where u i ' 1 '' 1 is some zero mean Gaussian noise. We will use the same notation for v J that we 



use for . 



As in Lemma 2.5 for the single-sensor case, we can use the next n times stages to apply a 
control action. We then apply the above scheme repeatedly and sample every 2n time stages. 
By a slight abuse of notation, we define x s = x 2 „ s , A = diag(A Af , . . . , A x ) = A 2n , u s = U2ns, 
w = w 2ns and = \- 3 2ns to get the system 



where x s 



Ax s + u s . 



r M\T 



w s , yl = xi + vi, l<j<M, 

T 



(15) 



, u s is chosen arbitrarily by the contoller and is known 
by sensor j at time s. The noise processes {w s }, {v^} are each i.i.d. sequences of zero mean 
Gaussian random vectors. At time s, w s and each are independent of the state x s but may 
be correlated with eachother. For si ^ S2 we have that w Sl and v S2 are independent. 

Finally, we can assume that each Aj is in real Jordan form. Using the same notation for x s , A s 
as for x t , we associate with each x^ %,h the bin size A J s ' l ' h . We define the sequence of stopping 
times 

To = 0, r z+1 = mm{s > r z : |y s | = |x s + v s | < AJ. 



The feedback value $b_{2ns}$ is chosen as

$$b_{2ns} = \begin{cases} 1, & \text{if } s = \tau_z \text{ for some } z \in \mathbb{N}, \\ 0, & \text{otherwise,} \end{cases}$$

so that we can then apply the same coding and control policy as in Section II-C. This reduces
the problem to the single-sensor case and we obtain the result. □



C. Sufficient Conditions for the General Multi-Sensor Case 



In Section III-B, Assumption 3.3 allowed us to diagonalize $A$ in (12) in Theorem 3.1. Without
this assumption, the lower components of the state act as noise for the upper components. In
particular, we need to bound these lower modes when all quantizers are perfectly-zoomed to
achieve (b) of Theorem 2.9. To do this, we must have that the bin sizes of the lower modes are
small compared with the upper ones. With many different eigenvalues, we cannot guarantee this
in the general case. Below, we give a sufficient rate and an alternative assumption for stability.

For Theorem 3.6 below, let us write $\Lambda(A_j) = \{\lambda_{j,1}, \dots, \lambda_{j,n_j}\}$ where $A_j$ is given in (12).

Theorem 3.6: There exists a coding and control policy which gives:

(a) the existence of a unique invariant distribution for $\{x_{2nt}\}$;

(b) $\lim_{t \to \infty} E[\|x_{2nt}\|_2^2] < \infty$,

and with average rate in the limit of large sampling periods

$$\lim_{T \to \infty} R_{\mathrm{avg}} = \sum_{j=1}^{M} \sum_{i=1}^{n_j} \log_2\left(\max\{|\lambda_{j,i}|, |\lambda_{h,\ell}| : h < j, \; 1 \le \ell \le n_h\}\right).$$

Proof of Theorem 3.6: The proof follows that of Theorem 2.3. The main difference is that we
define $\lambda'_{j,i} = \max\{|\lambda_{j,i}|, |\lambda_{h,\ell}| : h < j, \; 1 \le \ell \le n_h\}$ and the bin numbers $K_{j,i} = \lceil (\lambda'_{j,i})^{2n} + \epsilon \rceil$
for some $\epsilon > 0$ and treat the lower components of the state as noise. □

Clearly, we could also achieve (a) and (b) in Theorem 3.6 with $\lim_{T \to \infty} R_{\mathrm{avg}} = n \log_2(\lambda_{\mathrm{absmax}})$
where $\lambda_{\mathrm{absmax}} = \max_i(|\lambda_i|)$.



For Theorem 3.7 below, recall that we have some flexibility in the decomposition given by
Theorem 3.1. See the proof of Theorem 3.1 and Remark 3.2.

Theorem 3.7: If the eigenvalues of $A_M, \dots, A_1$ in (12) are ordered in decreasing magnitude
then Theorem 3.4 holds without Assumption 3.3. That is, the theorem holds if for $\lambda_i \in \Lambda(A_i)$
and $\lambda_j \in \Lambda(A_j)$ we have that $|\lambda_i| \le |\lambda_j|$ when $i < j$.

Proof of Theorem 3.7: The proof follows that of Theorem 2.3. Since the eigenvalues are
ordered in decreasing magnitude, we can maintain the ordering of the bin sizes and treat the
lower components as noise. □

Finally, a remark on the vector scheme we have employed in this paper is in order.

Remark 3.8: In this paper we present a vector stabilization scheme. From the problem
statement, it would be natural to adopt a sequential stabilization scheme. That is, each of the
components of the state is viewed as a separate system. In this case, we lose the Markov property
and the number of time stages we must wait (denoted by $H$ in Theorem 2.9) to establish geometric
decay is not uniform across the set of valid conditions $(x_{\tau_z}, \Delta_{\tau_z})$. Such a scheme is left for future
work.
IV. Conclusion

In this paper, we have presented a coding and control policy which achieves the minimum rate 
asymptotically in the limit of large sampling periods. We extend this result to the multi-sensor 
case under the assumption that each eigenspace is observed by some sensor. In the absence of 
this assumption, we give sufficient conditions for achieving stability. In all cases, we establish 
the existence of a unique invariant distribution for the sampled state and a finite second moment 
of the state. These strong forms of stability have not been considered in the literature for such 
systems to our knowledge. The proofs use random-time drift criteria for Markov chains. We 
wish to extend the results for control over general noisy channels along the lines of Il22l . [|23l , 
Il24t ||25H and (3*3. 



V. Appendix 

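A. Supporting Results for Section II-D

Proof of Theorem 2.4: Let $\{\lambda_1, \dots, \lambda_n\}$ denote the set of eigenvalues of $A$ with multiplicity.
We give our control policy for period $T = 2n$ with a fixed average rate of
$R_{\mathrm{avg}} = \frac{1}{2n} \log_2\left(\left\{\prod_{i=1}^{n} \lceil |\lambda_i|^{2n} + \epsilon \rceil\right\} + 1\right)$. Suppose that instead of sending an estimate every $2n$ time
stages, we apply them periodically every $T2n$ time stages. Taking the limit as $T$ approaches
infinity, our average rate satisfies

$$\lim_{T \to \infty} R_{\mathrm{avg}} \le \lim_{T \to \infty} \frac{1}{T2n} \log_2\left(\prod_{i=1}^{n} \lceil |\lambda_i|^{T2n} + \epsilon \rceil + 1\right) \le \lim_{T \to \infty} \frac{1}{T2n} \sum_{i=1}^{n} \log_2\left(\lceil |\lambda_i|^{T2n} + \epsilon \rceil + 1\right) = \sum_{i=1}^{n} \log_2(|\lambda_i|).$$

In this sense, our policy achieves the minimum rate (3) asymptotically. □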



Proof of Lemma 2.5: Recall the basic recursion for LTI systems.

$$x_t = A x_{t-1} + B u_{t-1} + w_{t-1} = A^2 x_{t-2} + AB u_{t-2} + B u_{t-1} + A w_{t-2} + w_{t-1} = \cdots = A^t x_0 + \sum_{i=0}^{t-1} A^{t-1-i} B u_i + \sum_{i=0}^{t-1} A^{t-1-i} w_i.$$

In the first $n$ time stages the sensor makes observations on the state and forms an estimate. In
the second $n$ time stages we allow the controller to apply a control action.
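We set $u_i = 0$ for $0 \le i \le n-2$ so that the first $n$ observations of the sensor are

$$\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{bmatrix} = \mathcal{O}_{(C,A)} x_0 + \begin{bmatrix} 0 \\ C w_0 \\ \vdots \\ C \sum_{i=0}^{n-2} A^{n-2-i} w_i \end{bmatrix} + \begin{bmatrix} v_0 \\ v_1 \\ \vdots \\ v_{n-1} \end{bmatrix},$$

where $\mathcal{O}_{(C,A)}$ is the observability matrix of the pair $(C, A)$. We have assumed that $(C, A)$ is an
observable pair. Equivalently, $\mathcal{O}_{(C,A)}$ has full column rank. By choosing a subset of $n$ equations
from the matrix equation above, it is clear that we can apply the inverse to obtain the estimate
$\tilde{y}_0 = x_0 + \sum_{i=0}^{n-2} \xi_i w_i + \sum_{i=0}^{n-1} \zeta_i v_i$, for some set of matrices $\xi_i \in \mathbb{R}^{n \times n}$ and $\zeta_i \in \mathbb{R}^{n \times p}$.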

Our estimate $\tilde{y}_0$ is generated at time $n-1$. At this time stage, the sensor sends the encoded
value $q_{n-1}$ to the controller through the finite capacity channel. Based on this information, we
allow the controller to apply control actions in time stages $n$ to $2n-1$. This is standard and we
do not describe it in detail. We then have the system of equations

$$x_{2n} = A^{2n} x_0 + \tilde{u}_0 + \sum_{i=0}^{2n-1} A^{2n-1-i} w_i, \qquad \tilde{y}_0 = x_0 + \sum_{i=0}^{n-2} \xi_i w_i + \sum_{i=0}^{n-1} \zeta_i v_i,$$

where at time $n-1$, the estimate $\tilde{y}_0$ is known by the sensor and the action $\tilde{u}_0$ is chosen arbitrarily
by the controller.

Let us define the sampled state $\tilde{x}_s = x_{2ns}$ and let $\tilde{y}_s$ denote the corresponding estimate of $x_{2ns}$. We define the noise processes

$$\tilde{w}_s = \sum_{i=0}^{2n-1} A^{2n-1-i} w_{2ns+i}, \qquad \tilde{v}_s = \sum_{i=0}^{n-2} \xi_i w_{2ns+i} + \sum_{i=0}^{n-1} \zeta_i v_{2ns+i},$$

and note that they are both sequences of i.i.d. multivariate Gaussian random vectors. Then, by
repeating our procedure every $2n$ time stages, we obtain the system

$$\tilde{x}_{s+1} = A^{2n} \tilde{x}_s + \tilde{u}_s + \tilde{w}_s, \qquad \tilde{y}_s = \tilde{x}_s + \tilde{v}_s.$$

Finally, we apply a real Jordan transformation to the above system. We define $\bar{x}_s = P\tilde{x}_s$,
$\bar{A} = P A^{2n} P^{-1}$, $\bar{u}_s = P\tilde{u}_s$, $\bar{w}_s = P\tilde{w}_s$, $\bar{y}_s = P\tilde{y}_s$ and $\bar{v}_s = P\tilde{v}_s$ where $P$ is the Jordan
transform matrix. This gives the system

$$\bar{x}_{s+1} = \bar{A}\bar{x}_s + \bar{u}_s + \bar{w}_s, \qquad \bar{y}_s = \bar{x}_s + \bar{v}_s.$$

Note that the matrix $\bar{A}$ has eigenvalues $\lambda_1^{2n}, \dots, \lambda_n^{2n}$. □
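The estimation step in the proof above can be sanity-checked numerically in the noise-free case. The sketch below assumes NumPy, uses a hypothetical observable pair `(C, A)`, and solves the stacked observation equations by least squares rather than by inverting a selected subset of $n$ equations as in the proof; with noise switched off, both recover $x_0$ exactly.

```python
import numpy as np

def estimate_initial_state(A, C, ys):
    """Solve O x0 = [y_0; ...; y_{n-1}] in the least-squares sense (zero control)."""
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    x0, *_ = np.linalg.lstsq(O, np.concatenate(ys), rcond=None)
    return x0

# Hypothetical observable pair: a 2x2 Jordan block seen through its first state.
A = np.array([[2.0, 1.0], [0.0, 2.0]])
C = np.array([[1.0, 0.0]])
x_true = np.array([0.7, -1.2])

# Collect the first n observations with u = 0 and no plant or sensor noise.
x, ys = x_true.copy(), []
for _ in range(A.shape[0]):
    ys.append(C @ x)
    x = A @ x

x_hat = estimate_initial_state(A, C, ys)
```

With noise present, the same linear map applied to the observations yields $x_0$ plus a linear combination of the noise terms, which is the form of the estimate used in the proof.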
Remark 5.1: The estimate used in Lemma 2.5 may appear naive. At first glance it would
appear better to apply the Kalman filter. In this case, a new system is formed with the estimate
as the state. The problem is that the noise for this system is not independent across time and
we cannot extend our result to the multi-sensor case.



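Proof of Lemma 2.7: Note that under our control policy we can write $u_s = g(x_s, v_s, \Delta_s)$ and $\Delta_{s+1} = f(x_s, v_s, \Delta_s)$ for some functions $g$ and $f$.

Let $\mathcal{B}(\mathbb{R}^n \times \mathbb{R}^n)$ be the Borel $\sigma$-field on $\mathbb{R}^n \times \mathbb{R}^n$. It follows that

$$\begin{aligned}
&P\big((x_{s+1}, \Delta_{s+1}) \in (C \times D) \mid (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big) \\
&\quad = P\big(x_{s+1} \in C \mid \Delta_{s+1} \in D, (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big)\, P\big(\Delta_{s+1} \in D \mid (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big) \\
&\quad = P\big(Ax_s + u_s + w_s \in C \mid \Delta_{s+1} \in D, (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big)\, P\big(f(x_s, v_s, \Delta_s) \in D \mid (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big) \\
&\quad = P\big(Ax_s + g(x_s, v_s, \Delta_s) + w_s \in C \mid \Delta_{s+1} \in D, (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big)\, P\big(f(x_s, v_s, \Delta_s) \in D \mid (x_s, \Delta_s), \ldots, (x_0, \Delta_0)\big) \\
&\quad = P\big(Ax_s + g(x_s, v_s, \Delta_s) + w_s \in C \mid \Delta_{s+1} \in D, (x_s, \Delta_s)\big)\, P\big(f(x_s, v_s, \Delta_s) \in D \mid (x_s, \Delta_s)\big) \\
&\quad = P\big((x_{s+1}, \Delta_{s+1}) \in (C \times D) \mid (x_s, \Delta_s)\big),
\end{aligned}$$

for all $(C \times D) \in \mathcal{B}(\mathbb{R}^n \times \mathbb{R}^n)$. □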
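The Markov property can be seen in a minimal scalar sketch of one update of the pair $(x_s, \Delta_s)$. The quantizer and zoom rule below are assumptions made for illustration and are not quoted from Section II-C: a uniform quantizer with `K` bins, control $u = -\lambda\hat{x}$ when the observation is in range, and a bin size that shrinks by a factor `alpha` above the threshold `L` or grows by $\rho|\lambda|$ on overflow. All parameter values are hypothetical.

```python
import math


def step(x, delta, w, v, lam=2.0, K=4, rho=1.1, alpha=0.8, L=1.0):
    """One update of (x_s, Delta_s) in a scalar sketch (assumed policy)."""
    y = x + v                                    # noisy observation
    if abs(y) <= K * delta / 2:                  # observation falls inside the quantizer range
        idx = min(int(math.floor((y + K * delta / 2) / delta)), K - 1)
        x_hat = -K * delta / 2 + (idx + 0.5) * delta   # midpoint of the bin containing y
        u = -lam * x_hat
        new_delta = alpha * delta if delta > L else delta
    else:                                        # overflow: no control, zoom out
        u = 0.0
        new_delta = rho * abs(lam) * delta
    return lam * x + u + w, new_delta


in_range = step(0.3, 1.0, 0.0, 0.0)    # observation 0.3 lies in [-2, 2]
overflow = step(5.0, 1.0, 0.0, 0.0)    # observation 5.0 lies outside [-2, 2]
```

The function takes only the current pair and the fresh noise values as arguments and keeps no hidden state, which is the content of the lemma in this simplified setting.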



Proof of Lemma 2.8: This follows immediately from the scalar case, as presented in the proof of Theorem 2.4 of [12]. We can choose $\rho$, $\epsilon$ and $\eta$ such that $\log_2(Q(q_s, \Delta_s))$ takes values in integer multiples of $\ell$ and the integers taken are relatively prime. By setting each $\log_2(\Delta_0^i)$ to be an integer multiple of $\ell$, it follows from the equation

$$\log_2(\Delta^i_{s+1})/\ell = \log_2(Q(q_s, \Delta_s))/\ell + \log_2(\Delta^i_s)/\ell$$

that $\log_2(\Delta^i_s)$ is an integer multiple of $\ell$ for all $s \in \mathbb{N}$. □
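The lattice argument can be sketched numerically. The snippet below assumes, purely for illustration, that the normalized logarithm of the bin size moves by one of two relatively prime integer steps per update (here $+3$ and $-2$); these values and the function name are hypothetical.

```python
from math import gcd


def reachable_levels(steps, start=0, horizon=12):
    """Integer levels of the normalized log bin size reachable within `horizon` updates."""
    levels = {start}
    for _ in range(horizon):
        # Every update adds one of the allowed integer steps to a reachable level.
        levels |= {m + s for m in levels for s in steps}
    return levels


levels = reachable_levels((3, -2))
```

Every reachable level stays an integer, and because the two steps are relatively prime, all integers in a window around the starting level are reached.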
To prove Theorem 2.9 we need the following simple Gaussian bound. Recall that $\lambda(\cdot)$ denotes the set of eigenvalues of its argument. Let us define $\lambda_{\min}(A) = \min \lambda(A)$ and $\lambda_{\max}(A) = \max \lambda(A)$.

Lemma 5.2: Let $X \sim \mathcal{N}(0, \Sigma)$ be a multivariate normal random variable with mean zero and covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$. For $\Delta \in \mathbb{R}^n$, the following bound holds.

$$P(|X| \not\le \Delta) \le \frac{2\lambda_{\max}^{(n+1)/2}(\Sigma)}{\sqrt{2\pi \det(\Sigma)}} \sum_{i=1}^{n} \exp\left\{-\frac{(\Delta^i)^2}{2\lambda_{\max}(\Sigma)}\right\}$$



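Proof of Lemma 5.2: Let $X \sim \mathcal{N}(0, \Sigma)$ be a multivariate normal random vector with mean zero and covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$. We avoid the degenerate case and assume that $\Sigma$ is positive-definite. Let $\Delta \in \mathbb{R}^n$. Then

$$\begin{aligned}
P(|X| \not\le \Delta) &= P\Big(\bigcup_{i=1}^{n} \{|x^i| > \Delta^i\}\Big) \le \sum_{i=1}^{n} P(|x^i| > \Delta^i) \\
&= \sum_{i=1}^{n} \int_{|x^i| > \Delta^i} \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\Big\{-\frac{1}{2} x^T \Sigma^{-1} x\Big\}\, dx \\
&\le \sum_{i=1}^{n} \int_{|x^i| > \Delta^i} \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\Big\{-\frac{1}{2} \lambda_{\min}(\Sigma^{-1})\, x^T x\Big\}\, dx \\
&\le \sum_{i=1}^{n} \frac{2}{\sqrt{2\pi \det(\Sigma)}\, \lambda_{\min}^{(n+1)/2}(\Sigma^{-1})\, \Delta^i} \exp\Big\{-\frac{1}{2} \lambda_{\min}(\Sigma^{-1}) (\Delta^i)^2\Big\} \\
&\le C \sum_{i=1}^{n} \exp\Big\{-\frac{1}{2} \lambda_{\min}(\Sigma^{-1}) (\Delta^i)^2\Big\} = C \sum_{i=1}^{n} \exp\Big\{-\frac{(\Delta^i)^2}{2\lambda_{\max}(\Sigma)}\Big\},
\end{aligned}$$

where the last line follows since we ensure $\Delta^i \ge 1$ for all $1 \le i \le n$ under our coding and control policy. We have also defined the constant $C = 2 / \big(\sqrt{2\pi \det(\Sigma)}\, \lambda_{\min}^{(n+1)/2}(\Sigma^{-1})\big)$. The eigenvalues of $\Sigma^{-1}$ are the inverse eigenvalues of $\Sigma$. This gives the desired bound. □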
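As a sanity check of this type of bound, the scalar case $n = 1$ can be evaluated numerically. For $X \sim \mathcal{N}(0, \sigma^2)$ and $\Delta \ge 1$, the standard Gaussian tail estimate gives $P(|X| > \Delta) \le \frac{2\sigma}{\sqrt{2\pi}} \exp\{-\Delta^2 / (2\sigma^2)\}$. The sketch below compares this with the exact tail computed from the complementary error function; the function names and test values are illustrative.

```python
import math


def gaussian_tail(delta, sigma):
    """Exact two-sided tail P(|X| > delta) for X ~ N(0, sigma^2)."""
    return math.erfc(delta / (sigma * math.sqrt(2.0)))


def tail_bound(delta, sigma):
    """Scalar tail bound 2*sigma/sqrt(2*pi) * exp(-delta^2 / (2*sigma^2)), for delta >= 1."""
    return 2.0 * sigma / math.sqrt(2.0 * math.pi) * math.exp(-delta ** 2 / (2.0 * sigma ** 2))


# Compare the exact tail and the bound on a small grid with delta >= 1.
checks = [(s, d, gaussian_tail(d, s), tail_bound(d, s))
          for s in (0.5, 1.0, 1.5, 3.0) for d in (1.0, 2.0, 4.0)]
```

The requirement $\Delta \ge 1$ matters: it is what allows the factor $1/\Delta$ from the tail integral to be dropped.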
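Proof of Theorem 2.9: i) Exponential Bound. Note that

$$\begin{aligned}
P(\tau_{z+1} - \tau_z > k \mid x_{\tau_z}, \Delta_{\tau_z})
&= P\Big(|x_{\tau_z+k} + v_{\tau_z+k}| \not\le \tfrac{K}{2}\Delta_{\tau_z+k},\ \bigcap_{s=1}^{k-1} \big\{|x_{\tau_z+s} + v_{\tau_z+s}| \not\le \tfrac{K}{2}\Delta_{\tau_z+s}\big\} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\le P\Big(|x_{\tau_z+k} + v_{\tau_z+k}| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \bigcap_{s=1}^{k-1} \big\{|x_{\tau_z+s} + v_{\tau_z+s}| \not\le \tfrac{K}{2}\Delta_{\tau_z+s}\big\},\ x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&= P\Big(|x_{\tau_z+k} + v_{\tau_z+k}| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \tau_{z+1} - \tau_z > k-1,\ x_{\tau_z}, \Delta_{\tau_z}\Big). \qquad (16)
\end{aligned}$$

We first let $\lambda \in \mathbb{R}$. Let us define the noise vector $w_{\tau_z,k} = \frac{A^k}{\lambda^k}\big(-v_{\tau_z} + \sum_{s=0}^{k-1} A^{-1-s} w_{\tau_z+s}\big) + \frac{v_{\tau_z+k}}{\lambda^k}$ and note that it is multivariate Gaussian. Before obtaining our bound, we define $\xi = \lceil |\lambda| + \epsilon \rceil / (|\lambda| + \epsilon - \eta) > 1$. We let $N$ denote the nilpotent matrix (the matrix with ones on the upper diagonal) of appropriate size. Note that $N^s = 0$ for all $s \ge n$. Under our control policy, as described in Section II-C, we know that $|(x_{\tau_z} + v_{\tau_z}) - \hat{x}_{\tau_z}| \le \frac{1}{2}\Delta_{\tau_z}$. It then follows that

$$\begin{aligned}
&P\Big(|x_{\tau_z+k} + v_{\tau_z+k}| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \tau_{z+1} - \tau_z > k-1,\ x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad = P\Big(\Big|A^k x_{\tau_z} + A^{k-1} u_{\tau_z} + \sum_{s=0}^{k-1} A^{k-1-s} w_{\tau_z+s} + v_{\tau_z+k}\Big| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \tau_{z+1} - \tau_z > k-1,\ x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad = P\Big(\Big|A^k (x_{\tau_z} - \hat{x}_{\tau_z} + v_{\tau_z} - v_{\tau_z}) + \sum_{s=0}^{k-1} A^{k-1-s} w_{\tau_z+s} + v_{\tau_z+k}\Big| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \tau_{z+1} - \tau_z > k-1,\ x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad = P\Big(\Big|(\lambda I + N)^k (x_{\tau_z} + v_{\tau_z} - \hat{x}_{\tau_z}) + A^k\Big(-v_{\tau_z} + \sum_{s=0}^{k-1} A^{-1-s} w_{\tau_z+s}\Big) + v_{\tau_z+k}\Big| \not\le \tfrac{K}{2}\Delta_{\tau_z+k} \,\Big|\, \tau_{z+1} - \tau_z > k-1,\ x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad \le P\Big(\big|(\lambda I + N)^k (x_{\tau_z} + v_{\tau_z} - \hat{x}_{\tau_z}) + \lambda^k w_{\tau_z,k}\big| \not\le \tfrac{\xi}{2}\rho^{k-1}|\lambda|^k \Delta_{\tau_z} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad \le P\Big(\big|(\lambda I + N)^k (x_{\tau_z} + v_{\tau_z} - \hat{x}_{\tau_z})\big| + |\lambda|^k |w_{\tau_z,k}| \not\le \tfrac{\xi}{2}\rho^{k-1}|\lambda|^k \Delta_{\tau_z} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \\
&\quad \le P\Big(|\lambda|^k \tfrac{1}{2}\Delta_{\tau_z} + \tfrac{1}{2}\Delta_{\tau_z} \sum_{i=1}^{n-1} \binom{k}{i} |\lambda|^{k-i} + |\lambda|^k |w_{\tau_z,k}| \not\le \tfrac{\xi}{2}\rho^{k-1}|\lambda|^k \Delta_{\tau_z} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \qquad (17) \\
&\quad \le P\Big(\tfrac{1}{2}\Delta_{\tau_z} + \tfrac{\delta}{2}\Delta_{\tau_z} n k^n + |w_{\tau_z,k}| \not\le \rho^{k-1}\tfrac{\xi}{2}\Delta_{\tau_z} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \qquad (18) \\
&\quad \le P\Big(|w_{\tau_z,k}| \not\le \tfrac{(\rho')^{k-1}}{2}\Delta_{\tau_z} \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big) \le \frac{2\lambda_{\max}^{(n+1)/2}(\Sigma_{\tau_z,k})}{\sqrt{2\pi \det(\Sigma_{\tau_z,k})}} \sum_{i=1}^{n} \exp\Big\{-\frac{(\rho')^{2(k-1)} (\Delta^i_{\tau_z})^2}{8\lambda_{\max}(\Sigma_{\tau_z,k})}\Big\}, \qquad (19)
\end{aligned}$$

where (17) follows from our bin ordering. Equations (18) and (19) hold for all $k > H$ for some $H$ sufficiently large and in the special case of $k = 1$. In the case $k = 1$ we choose $\delta$ sufficiently small such that $\xi - 1 - \delta n > 0$. Equation (19) holds for some $1 < \rho' < \rho$ since we need only show that $\rho^{k-1}\xi - 1 - \delta n k^n \ge (\rho')^{k-1}$ for sufficiently large $k$, and this follows since $\lim_{k \to \infty} \rho^{k-1} / (\rho')^{k-1} = \infty$ and $\lim_{k \to \infty} (-1 - \delta n k^n) / (\rho')^{k-1} = 0$ by L'Hôpital's rule. In (19), we have used Lemma 5.2 with the zero mean Gaussian vector $w_{\tau_z,k}$ and denoted its covariance matrix by $\Sigma_{\tau_z,k}$. From (19), we can see that (b) of Theorem 2.9 holds.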
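The claim that $\rho^{k-1}\xi - 1 - \delta n k^n \ge (\rho')^{k-1}$ for all sufficiently large $k$ can be checked numerically for given constants. The sketch below searches for the first $k$ from which the inequality holds up to a finite horizon; the parameter values and the function name are hypothetical and are not taken from the paper.

```python
def first_valid_k(rho, rho_prime, xi, delta, n, k_max=200):
    """Smallest k0 such that the inequality holds for every k with k0 <= k <= k_max."""
    k0 = None
    for k in range(1, k_max + 1):
        holds = rho ** (k - 1) * xi - 1 - delta * n * k ** n >= rho_prime ** (k - 1)
        if not holds:
            k0 = None          # a failure resets the search: the inequality must hold from k0 on
        elif k0 is None:
            k0 = k
    return k0


# Illustrative constants with 1 < rho' < rho.
threshold = first_valid_k(rho=1.5, rho_prime=1.2, xi=1.2, delta=0.1, n=2)
```

For these illustrative values the polynomial term dominates for small $k$ and the inequality only takes over from $k = 8$, which is why the argument is stated for $k > H$ with $H$ sufficiently large.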



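In order to bound (19) further, we define the covariance matrices $\Sigma_v = E[v_s v_s^T]$, $\Sigma_{v,w} = E[v_s w_s^T]$ and $\Sigma_w = E[w_s w_s^T]$. Then

$$\begin{aligned}
\Sigma_{\tau_z,k} = E[w_{\tau_z,k} w_{\tau_z,k}^T]
&= E\Big[\Big(\frac{A^k}{\lambda^k}\Big(-v_{\tau_z} + \sum_{s=0}^{k-1} A^{-1-s} w_{\tau_z+s}\Big) + \frac{v_{\tau_z+k}}{\lambda^k}\Big)\Big(\frac{A^k}{\lambda^k}\Big(-v_{\tau_z} + \sum_{s=0}^{k-1} A^{-1-s} w_{\tau_z+s}\Big) + \frac{v_{\tau_z+k}}{\lambda^k}\Big)^T\Big] \\
&= \frac{A^k}{\lambda^k}\Big(\Sigma_v - \Sigma_{v,w}(A^{-1})^T - A^{-1}\Sigma_{v,w}^T + \sum_{s=0}^{k-1} A^{-1-s} \Sigma_w (A^{-1-s})^T\Big)\frac{(A^k)^T}{\lambda^k} + \frac{\Sigma_v}{\lambda^{2k}},
\end{aligned}$$

where we have used the independence of $\{w_s\}$, $\{v_s\}$ across time and the independence of $v_{s_1}$ and $w_{s_2}$ for $s_1 \ne s_2$. Since both processes are zero mean, the cross terms are zero.
Recall that

$$A^k = \begin{bmatrix} \lambda^k & \binom{k}{1}\lambda^{k-1} & \cdots & \binom{k}{n-1}\lambda^{k-n+1} \\ & \lambda^k & \cdots & \binom{k}{n-2}\lambda^{k-n+2} \\ & & \ddots & \vdots \\ & & & \lambda^k \end{bmatrix}$$

and let $\mathrm{Tr}(\cdot)$ denote the trace of its argument. We get that

$$\mathrm{Tr}(A^k (A^k)^T) = \sum_{\ell=0}^{n-1} \sum_{s=0}^{\ell} \binom{k}{s}^2 \lambda^{2(k-s)} \le n \sum_{s=0}^{n-1} \binom{k}{s}^2 \lambda^{2(k-s)} \le \lambda^{2k} n \sum_{s=0}^{n-1} k^{2s} \le \lambda^{2k} n^2 k^{2n}.$$

Similarly, we can see that $\mathrm{Tr}(A^{k-1-s} (A^{k-1-s})^T) \le \lambda^{2k} n^2 k^{2n}$ for all $0 \le s \le k-1$.

Define $\Sigma_1 = E[(-v_{\tau_z} + A^{-1} w_{\tau_z})(-v_{\tau_z} + A^{-1} w_{\tau_z})^T] = \Sigma_v - \Sigma_{v,w}(A^{-1})^T - A^{-1}\Sigma_{v,w}^T + A^{-1}\Sigma_w (A^{-1})^T$. For symmetric matrices, every eigenvalue has an eigenvector. Thus, for $\Sigma_{\tau_z,k}$, there exists a vector of unit length $e \in \mathbb{R}^n$ such that

$$\begin{aligned}
\lambda_{\max}(\Sigma_{\tau_z,k}) = e^T \Sigma_{\tau_z,k}\, e
&= \frac{1}{\lambda^{2k}} (e^T A^k) \Sigma_1 ((A^k)^T e) + \frac{1}{\lambda^{2k}} \sum_{s=1}^{k-1} (e^T A^{k-1-s}) \Sigma_w ((A^{k-1-s})^T e) + \frac{1}{\lambda^{2k}} e^T \Sigma_v e \\
&\le \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_1)\, e^T A^k (A^k)^T e + \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_w) \sum_{s=1}^{k-1} e^T A^{k-1-s} (A^{k-1-s})^T e + \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_v)\, e^T e \\
&\le \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_1) \lambda_{\max}(A^k (A^k)^T)\, e^T e + \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_w) \sum_{s=1}^{k-1} \lambda_{\max}(A^{k-1-s} (A^{k-1-s})^T)\, e^T e + \lambda_{\max}(\Sigma_v) \\
&\le \frac{1}{\lambda^{2k}} \lambda_{\max}(\Sigma_1) \mathrm{Tr}(A^k (A^k)^T) + \frac{k}{\lambda^{2k}} \lambda_{\max}(\Sigma_w) \max_{1 \le s \le k-1} \mathrm{Tr}(A^{k-1-s} (A^{k-1-s})^T) + \lambda_{\max}(\Sigma_v) \\
&\le n^2 k^{2n+1} \big(\lambda_{\max}(\Sigma_1) + \lambda_{\max}(\Sigma_w) + \lambda_{\max}(\Sigma_v)\big).
\end{aligned}$$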

Recall Minkowski's Determinant Theorem (see for example lETTO . For nonnegative definite n x n 
matrices A and B, it follows that det(A + B) > det(A) + det(B). 
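Using the above bound and the identity $\det(A) = \det(A^T)$ we get

$$\begin{aligned}
\det(\Sigma_{\tau_z,k}) &\ge \frac{1}{\lambda^{2kn}} \det(A^k)^2 \det\Big(\Sigma_1 + \sum_{s=1}^{k-1} A^{-1-s} \Sigma_w (A^{-1-s})^T\Big) + \frac{1}{\lambda^{2kn}} \det(\Sigma_v) \\
&\ge \det(\Sigma_1) + \det\Big(\sum_{s=1}^{k-1} A^{-1-s} \Sigma_w (A^{-1-s})^T\Big) \ge \det(\Sigma_1) + \sum_{s=1}^{k-1} \det\big(A^{-1-s} \Sigma_w (A^{-1-s})^T\big) \\
&= \det(\Sigma_1) + \det(\Sigma_w) \sum_{s=1}^{k-1} (\lambda^{-1-s})^{2n} \ge \det(\Sigma_1).
\end{aligned}$$

Defining the constants $c_1 = n^2 (\lambda_{\max}(\Sigma_1) + \lambda_{\max}(\Sigma_w) + \lambda_{\max}(\Sigma_v))$ and $c_2 = \det(\Sigma_1)$ we have obtained the bounds

$$\lambda_{\max}(\Sigma_{\tau_z,k}) \le c_1 k^{2n+1}, \qquad \det(\Sigma_{\tau_z,k}) \ge c_2. \qquad (20)$$

Combining (20) with (19) we get that

$$P(\tau_{z+1} - \tau_z > k \mid x_{\tau_z}, \Delta_{\tau_z}) \le C k^{(2n+1)(n+1)/2} \sum_{i=1}^{n} \exp\Big\{-\frac{(\rho')^{2(k-1)} (\Delta^i_{\tau_z})^2}{8 c_1 k^{2n+1}}\Big\}, \qquad (21)$$

where $C$ is the appropriate constant.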

ii) Geometric Bound. Note that in (21) we have a double exponential in $k$ since $(\rho')^{2(k-1)} = e^{(k-1) 2\log(\rho')}$. Let $a, b, c > 0$ and recall that $\lim_{k \to \infty} e^k / ((a+b) k^{c+1}) = \infty$ by L'Hôpital's rule. This means that for all $L > 0$ there exists an $N$ such that $e^k / ((a+b) k^{c+1}) > L$ for all $k > N$. Thus, $e^k / k^c > L(a+b)k$ for all $k > N$. Then choosing $N$ large enough so that $(a+b)N > 1$ and subtracting $(a+b)k$ we get $e^k / k^c - (a+b)k > (L-1)(a+b)k > L-1$ for all $k > N$. Therefore, we have that $\lim_{k \to \infty} e^k / k^c - (a+b)k = \infty$. Since $\log(k) < k$ for $k \ge 1$, comparing with the above we find that $\lim_{k \to \infty} e^k / k^c - a\log(k) - bk = \infty$.
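The divergence of $e^k / k^c - a\log(k) - bk$ can be illustrated numerically. The sketch below evaluates the expression for hypothetical constants $a = 5$, $b = 10$, $c = 3$; these values are chosen only to show that the linear and logarithmic terms can dominate for small $k$ before the exponential takes over.

```python
import math


def growth_gap(k, a, b, c):
    """Value of e^k / k^c - a*log(k) - b*k for an integer k >= 1."""
    return math.exp(k) / k ** c - a * math.log(k) - b * k


# Illustrative constants; the expression is negative early on and then grows without bound.
values = {k: growth_gap(k, a=5.0, b=10.0, c=3.0) for k in (5, 20, 30, 40)}
```

A finite table of values is of course no proof of the limit; it only shows the qualitative behaviour that the L'Hôpital argument establishes.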
Let $Q(k)$ be a polynomial of finite degree $m$. We can write $Q(k) = a_0 + a_1 k + \cdots + a_m k^m$ for some coefficients $a_0, \ldots, a_m \in \mathbb{R}$. Let $r > 1$ and consider

$$\lim_{k \to \infty} r^k Q(k) \exp\Big\{-\frac{e^k}{k^c}\Big\} = \lim_{k \to \infty} \sum_{i=0}^{m} a_i k^i r^k \exp\Big\{-\frac{e^k}{k^c}\Big\} = \lim_{k \to \infty} \sum_{i=0}^{m} a_i \exp\Big\{-\frac{e^k}{k^c} + i\log(k) + \log(r) k\Big\} = 0. \qquad (22)$$

Combining (21) and (22) gives the result for $\lambda \in \mathbb{R}$. The case of $\lambda \in \mathbb{C}$ is similar and we omit it. □



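Proof of Lemma 2.10: We take $\lambda \in \mathbb{R}$. The proof for $\lambda \in \mathbb{C}$ is identical. We use the abbreviations $P_z(k) = P(\tau_{z+1} - \tau_z = k \mid x_{\tau_z}, \Delta_{\tau_z})$, $P_z(k \mid Y) = P(\tau_{z+1} - \tau_z = k \mid Y, x_{\tau_z}, \Delta_{\tau_z})$, $\bar{P}_z(k) = P(\tau_{z+1} - \tau_z > k \mid x_{\tau_z}, \Delta_{\tau_z})$ and $\bar{P}_z(k \mid Y) = P(\tau_{z+1} - \tau_z > k \mid Y, x_{\tau_z}, \Delta_{\tau_z})$.

Put $r > \rho^2 |\lambda|^2$. Using Theorem 2.9, we can bound the first term in (10) using the law of iterated expectations as follows

$$\begin{aligned}
E\Big[\sum_{s=\tau_z}^{\tau_{z+1}-1} (\Delta^1_s)^2 \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big]
&= \sum_{k=1}^{\infty} P_z(k) \sum_{s=0}^{k-1} E\big[(\Delta^1_{\tau_z+s})^2 \mid \tau_{z+1} - \tau_z = k, \Delta_{\tau_z}\big] \\
&\le \sum_{k=1}^{\infty} P_z(k) \sum_{s=0}^{k-1} \rho^{2s} |\lambda|^{2s} (\Delta^1_{\tau_z})^2 \le \sum_{k=1}^{\infty} k P_z(k) \rho^{2k} |\lambda|^{2k} (\Delta^1_{\tau_z})^2 \\
&\le (\Delta^1_{\tau_z})^2 \Big(\sum_{k=1}^{H} k P_z(k) \rho^{2k} |\lambda|^{2k} + \sum_{k=H+1}^{\infty} \Big(\frac{\rho^2 |\lambda|^2}{r}\Big)^k\Big) = (\Delta^1_{\tau_z})^2 G_1. \qquad (23)
\end{aligned}$$

We have defined $G_1 = \sum_{k=1}^{H} k P_z(k) \rho^{2k} |\lambda|^{2k} + \sum_{k=H+1}^{\infty} (\rho^2 |\lambda|^2 / r)^k < \infty$. The series on the right converges since it is geometric. Similarly, we can bound the term $E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}]$. Using the law of total expectation, we get

$$\begin{aligned}
E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}]
&= P_z(1) E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] + \bar{P}_z(1) E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z > 1, x_{\tau_z}, \Delta_{\tau_z}] \\
&= P_z(1) E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] \\
&\quad + \bar{P}_z(1) E\big[E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z > 1, \tau_{z+1} - \tau_z, x_{\tau_z}, \Delta_{\tau_z}] \mid \tau_{z+1} - \tau_z > 1, x_{\tau_z}, \Delta_{\tau_z}\big] \\
&\le P_z(1) E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] + \bar{P}_z(1) \sum_{k=2}^{\infty} P_z(k \mid \tau_{z+1} - \tau_z > 1) \rho^{2k} |\lambda|^{2k} (\Delta^1_{\tau_z})^2 \\
&= P_z(1) E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] + \bar{P}_z(1) G_2 (\Delta^1_{\tau_z})^2, \qquad (24)
\end{aligned}$$

where we have defined $G_2 = \sum_{k=2}^{\infty} P_z(k \mid \tau_{z+1} - \tau_z > 1) \rho^{2k} |\lambda|^{2k} < \infty$. Convergence comes from the geometric decay, as in the previous bound. Note that the geometric bound in Theorem 2.9 still holds with $P_z(k \mid \tau_{z+1} - \tau_z > 1)$ in place of $P_z(k)$ since we obtain our bound by looking only at the $\tau_z + k$ term, as can be seen in (16).

There exists a $\zeta$ such that $0 < \zeta < 1 - (|\lambda| / (|\lambda| + \epsilon - \eta))^2$. We know from Theorem 2.9 that $\lim_{\Delta_{\tau_z} \to \infty} \bar{P}_z(1) = 0$. Recall that $\Delta^i_t \ge L^i$ for all $t \in \mathbb{N}$. Then, we choose $L$ large enough such that $\bar{P}_z(1) G_2 < \zeta$. We put

$$\gamma = \frac{1}{G_1}\Big(1 - \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2 - \zeta\Big)$$

so that $\gamma > 0$. Now, if $\Delta_{\tau_z} \notin S_\Delta$ then we have that $\Delta^1_{\tau_z} > F$ since $\Delta^1_t \ge \Delta^2_t \ge \cdots \ge \Delta^n_t$ for all $t \in \mathbb{N}$ by construction. Since $F > L^1$, the bin size shrinks and

$$E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] = \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2 (\Delta^1_{\tau_z})^2.$$

If $\Delta_{\tau_z} \in S_\Delta$ then we use the simple bound $E[(\Delta^1_{\tau_{z+1}})^2 \mid \tau_{z+1} - \tau_z = 1, \Delta_{\tau_z}] \le \rho^2 |\lambda|^2 (\Delta^1_{\tau_z})^2$.

From the above, we have the following bounds. If $\Delta_{\tau_z} \notin S_\Delta$ then

$$E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}] \le (\Delta^1_{\tau_z})^2 \Big\{\Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2 + \zeta\Big\}. \qquad (25)$$

If $\Delta_{\tau_z} \in S_\Delta$ then

$$E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}] \le (\Delta^1_{\tau_z})^2 \{\rho^2 |\lambda|^2 + \zeta\}. \qquad (26)$$

In the case $\Delta_{\tau_z} \notin S_\Delta$ we apply (23), (24) and (25) to get

$$\gamma E\Big[\sum_{s=\tau_z}^{\tau_{z+1}-1} (\Delta^1_s)^2 \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big] \le (\Delta^1_{\tau_z})^2 \gamma G_1 = (\Delta^1_{\tau_z})^2 \Big\{1 - \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2 - \zeta\Big\} \le (\Delta^1_{\tau_z})^2 - E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}].$$

In the case $\Delta_{\tau_z} \in S_\Delta$ we apply (23), (24) and (26) to get

$$\begin{aligned}
\gamma E\Big[\sum_{s=\tau_z}^{\tau_{z+1}-1} (\Delta^1_s)^2 \,\Big|\, x_{\tau_z}, \Delta_{\tau_z}\Big]
&\le (\Delta^1_{\tau_z})^2 \gamma G_1 = (\Delta^1_{\tau_z})^2 \Big\{1 - \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2 - \zeta\Big\} \\
&= (\Delta^1_{\tau_z})^2 - (\Delta^1_{\tau_z})^2 \{\rho^2 |\lambda|^2 + \zeta\} + (\Delta^1_{\tau_z})^2 \Big\{\rho^2 |\lambda|^2 - \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2\Big\} \\
&\le (\Delta^1_{\tau_z})^2 - E[(\Delta^1_{\tau_{z+1}})^2 \mid x_{\tau_z}, \Delta_{\tau_z}] + F^2 \Big\{\rho^2 |\lambda|^2 - \Big(\frac{|\lambda|}{|\lambda| + \epsilon - \eta}\Big)^2\Big\}.
\end{aligned}$$

We set $b = F^2 \{\rho^2 |\lambda|^2 - (|\lambda| / (|\lambda| + \epsilon - \eta))^2\}$. Since $\Delta_{\tau_z} \in S_\Delta$ if and only if $(x_{\tau_z}, \Delta_{\tau_z}) \in S$, we obtain Lemma 2.10. □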



Proof of Theorem 2.11: Consider first the case $\lambda \in \mathbb{R}$ and let = 0. Using the law of total expectation we get



E 



E 



J\2 



X Tz ) A T 



T«4-l — 1 



:o 2 + E ( A <-i+^-i 



«i-i+wj-i) 2 



=t z + 1 



t 2+ i - r 2 , 


X r z , A Tz ] X 


t z ! ^t z 




oo 


k- 


1 / 






(4) 2 +E ^ 


i 


fc=l 




1 \ 




5-1 












X r z , Ar z 


3=0 









6—1 



3=1 



s-l-i„i+l 

X T z +i 



k=l 



fc-1 



J \2 



+ E + < - 4.) + a- 1 ^ 1 + <f - *;f) 



s-l 



8-1 



+ E A- 1 "^ - A s < - A- 1 ^ 1 + E A 5 " 1 

3=1 3=0 



^+3 



x A 

A T Z ) 



< 



k=l 



k-1 



[X 



i \2 



E (a 2s « + < - O 2 + a 2 ^- 1 ^ 1 + <+ x - 4+ 1 ) 1 



s=l 
2 



+ ('£a- 

o=o 



Tz+J 



X r z , A Tz 



(27) 



k=l 

s-l 



K 2 



(AU 2 + g^Q A g 2 +A 2 (s -i) Q A;f y 



+ s E A2(s-1-i) (4^) 2 + a 2s «) 2 + \ 2{s ~ 1] Kff 

3=1 

+ «Ea 2 ^ 1 -)« +j ) 

3=0 



X r z , A Tz 



(28) 
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oo / 1 i 

< 6 E P ^ X (A - )2 + E ( A2 i + s 2 A 2s M i+1 

k=l V s=l 



i \ 'Zs '£ i \'ls „i , „2\2s„2 



< 



k=l I s=l ^ 



+ s 2 M l+1 + a 2 , + a 2 m + sV 2 



(29) 



In (27) and (28) we have used Jensen's inequality. Line (29) follows since we can bound $\Delta^i_s \ge 1$ for all $s \in \mathbb{N}$. We have defined $M^i = \sup_{s \in \mathbb{N}} E[(x^i_s)^2] < \infty$, $\sigma_v^2 = E[(v^i_s)^2]$ and $\sigma_w^2 = E[(w^i_s)^2]$. The fact that $M^i$ is finite for $2 \le i \le n-1$ follows from induction in the proof of Theorem 2.3. By convention we put $M^{n+1} = 0$. Now, we apply Theorem 2.9 with $Q(k) = k^3$ and $r > \lambda^2$ to yield

$$\begin{aligned}
\sum_{k=1}^{\infty} P_z(k) \sum_{s=1}^{k-1} \lambda^{2s} s^2
&= \sum_{k=1}^{H} \sum_{s=1}^{k-1} P_z(k) \lambda^{2s} s^2 + \sum_{k=H+1}^{\infty} \sum_{s=1}^{k-1} \lambda^{2s} s^2 P_z(k) \\
&\le G + \sum_{k=H+1}^{\infty} \lambda^{2k} P_z(k) k^3 \le G + \sum_{k=H+1}^{\infty} \Big(\frac{\lambda^2}{r}\Big)^k < \infty.
\end{aligned}$$

The last series converges since it is geometric. We have defined $G = \sum_{k=1}^{H} \sum_{s=1}^{k-1} P_z(k) \lambda^{2s} s^2 < \infty$. Therefore we can set



fe-i 



K 



6 PM(K 2 /A + £ A 2s (1/2 + s 2 M^ + + < m + sV 2 ,) < oo 



k=l 



to get the result. For $\lambda \in \mathbb{C}$, the proof is similar and we omit it. □



B. A Supporting Result for Section III-B

Proof of Theorem 3.1: We define $n_1 = \dim(\mathcal{O}^1)$, and $n_j = \dim(\mathcal{O}^j \setminus \{\cup_{i=1}^{j-1} \mathcal{O}^i\})$, for $2 \le j \le M$. We choose $n_1$ linearly independent row vectors from $\mathcal{O}_{(C^1,A)}$ and label them $q^1_1, \ldots, q^1_{n_1}$. Proceeding by induction, we choose $\{q^j_1, \ldots, q^j_{n_j}\}$ from $\mathcal{O}_{(C^j,A)}$ such that

$$\{q^1_1, \ldots, q^1_{n_1}, q^2_1, \ldots, q^2_{n_2}, \ldots, q^j_1, \ldots, q^j_{n_j}\}$$

is a set of linearly independent vectors.

We define $Q^j = \big[(q^j_1)^T \cdots (q^j_{n_j})^T\big]^T$ for all $1 \le j \le M$ and concatenate these matrices to choose our transformation matrix $Q = \big[(Q^1)^T \cdots (Q^M)^T\big]^T$. It will also be convenient to denote the rows of $Q$ by $q_1, \ldots, q_n$ so that $Q = \big[(q_1)^T \cdots (q_n)^T\big]^T$.

From the Cayley-Hamilton Theorem, we know that for all $m \ge n$ there exist $a_0, \ldots, a_{n-1}$ such that $A^m = \sum_{i=0}^{n-1} a_i A^i$. Since $\{q^j_1, \ldots, q^j_{n_j}\}$ are rows of $\mathcal{O}_{(C^j,A)}$, this implies that $q^j_i A$ is in the row space of $\mathcal{O}_{(C^j,A)}$ for all $1 \le i \le n_j$. Let us define the sets

$$S_j := \{q^1_1, \ldots, q^1_{n_1}, q^2_1, \ldots, q^2_{n_2}, \ldots, q^j_1, \ldots, q^j_{n_j}\}, \quad 1 \le j \le M.$$

From our construction, it is then clear that

$$q^j_1 A, \ldots, q^j_{n_j} A \in \mathrm{span}(S_j). \qquad (30)$$

We write $\tilde{A}$ in terms of its column vectors as $\tilde{A} = [\tilde{a}_1 \cdots \tilde{a}_n]$ where $\tilde{a}_i \in \mathbb{R}^{n \times 1}$ for each $1 \le i \le n$. Our similarity transform gives

$$\tilde{A} Q = Q A. \qquad (31)$$

Recall from linear algebra that we can write the left side of (31) as $\sum_{i=1}^{n} \tilde{a}_i q_i$ where $\tilde{a}_i q_i \in \mathbb{R}^{n \times n}$ for each $1 \le i \le n$. Now, to return to our earlier notation, each vector $q^j_i A$ is a linear combination of $\{q^k_\ell : 1 \le k \le j, 1 \le \ell \le n_k\}$ and is linearly independent of the remaining rows of $Q$. Since $QA = \big[(q_1 A)^T \cdots (q_n A)^T\big]^T$, we see from (31) that the $i$-th row of $\tilde{A}$ is the representation of $q_i A$ with respect to $q_1, \ldots, q_n$. More precisely, we write $\tilde{a}_h = [a_{1,h} \cdots a_{n,h}]^T$ with each $a_{i,h} \in \mathbb{R}$ so that (31) gives the system of equations

$$q_i A = \sum_{h=1}^{n} a_{i,h}\, q_h, \quad 1 \le i \le n. \qquad (32)$$

Combining (30) and (32) gives the desired form.

We next turn our attention to the form of $\tilde{C}^j$. Since each $C^j$ is a submatrix of $\mathcal{O}_{(C^j,A)}$, it is clear that the rows of $C^j$ are in the span of $S_j$. Since $\tilde{C}^j Q = C^j$, by writing $\tilde{C}^j$ in terms of its column vectors we obtain the desired form. □
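The construction of the transformation matrix can be sketched for a small example. The code below is a minimal illustration under stated assumptions: rows are selected greedily from each sensor's observability matrix in sensor order, exact rational arithmetic is used, and the two-sensor system and all helper names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse(M):
    """Exact inverse of an invertible square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for i in range(n):
            if i != col:
                aug[i] = [a - aug[i][col] * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def observability_rows(C, A):
    """Rows of [C; CA; ...; CA^(n-1)] for a sensor matrix C given as a list of rows."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rows


def build_Q(A, sensors):
    """Keep, sensor by sensor, the observability rows that enlarge the span."""
    Q = []
    for C in sensors:
        for row in observability_rows(C, A):
            if rank(Q + [row]) > len(Q):
                Q.append(row)
    return Q


A = [[2, 0], [0, 3]]
sensors = [[[1, 0]], [[1, 1]]]          # C^1 sees the first mode only; C^2 sees both
Q = build_Q(A, sensors)
Q_inv = inverse(Q)
A_tilde = matmul(matmul(Q, A), Q_inv)   # solves A_tilde Q = Q A
C_tilde = [matmul(C, Q_inv) for C in sensors]
```

In this example the first row of the transformed system matrix has a zero in the second column, so the component observed by the first sensor evolves independently of the component that only the second sensor can see.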



C. Review of Symmetric Matrices, Markov Chains and Stochastic Stability

Recall that a matrix $A \in \mathbb{R}^{n \times n}$ is said to be symmetric if $A^T = A$.

Lemma 5.3: Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$. If we let $\lambda_{\min} = \min\{\lambda_1, \ldots, \lambda_n\}$ and $\lambda_{\max} = \max\{\lambda_1, \ldots, \lambda_n\}$ then $\lambda_{\min} x^T x \le x^T A x \le \lambda_{\max} x^T x$ for all $x \in \mathbb{R}^n$.
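The inequality of Lemma 5.3 can be checked numerically in the $2 \times 2$ case, where the eigenvalues of a symmetric matrix have a closed form. The sketch below is illustrative; the matrix entries and the grid of test vectors are arbitrary choices.

```python
import math


def sym2_eigs(a, b, d):
    """Eigenvalues (min, max) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def quad_form(a, b, d, x):
    """Quadratic form x^T A x for A = [[a, b], [b, d]]."""
    return a * x[0] ** 2 + 2 * b * x[0] * x[1] + d * x[1] ** 2


lam_min, lam_max = sym2_eigs(2.0, 1.0, 2.0)
grid = [(i / 2.0, j / 2.0) for i in range(-4, 5) for j in range(-4, 5)]
bounds_hold = all(
    lam_min * (x[0] ** 2 + x[1] ** 2) - 1e-12
    <= quad_form(2.0, 1.0, 2.0, x)
    <= lam_max * (x[0] ** 2 + x[1] ** 2) + 1e-12
    for x in grid
)
```

The bounds are attained along the eigenvectors, here $(1, -1)$ for the smallest eigenvalue and $(1, 1)$ for the largest.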
We present a brief discussion on stochastic stability of Markov chains. For a list of definitions on Markov chains, the reader is referred to [28] and [29]. Let $\phi = \{\phi_t, t \ge 0\}$ be a Markov chain with a complete separable metric state space $(\mathbb{X}, \mathcal{B}(\mathbb{X}))$, and defined on a probability space $(\Omega, \mathcal{F}, \mathcal{P})$, where $\mathcal{B}(\mathbb{X})$ denotes the Borel $\sigma$-field on $\mathbb{X}$, $\Omega$ is the sample space, $\mathcal{F}$ a sigma field of subsets of $\Omega$, and $\mathcal{P}$ a probability measure. Let $P(x, D) := P(\phi_{t+1} \in D \mid \phi_t = x)$ denote the transition probability from $x$ to $D$.

Definition 5.4: For a Markov chain, a probability measure $\pi$ is invariant on the Borel space $(\mathbb{X}, \mathcal{B}(\mathbb{X}))$ if $\pi(D) = \int_{\mathbb{X}} P(x, D)\, \pi(dx)$, $\forall D \in \mathcal{B}(\mathbb{X})$.

Definition 5.5: A Markov chain is $\mu$-irreducible if for any set $B \in \mathcal{B}(\mathbb{X})$ with $\mu(B) > 0$ and $\forall x \in \mathbb{X}$, there exists some integer $n > 0$, possibly depending on $B$ and $x$, such that $P^n(x, B) > 0$, where $P^n(x, B)$ is the transition probability in $n$ stages, that is, $P(\phi_{t+n} \in B \mid \phi_t = x)$.

Definition 5.6: A set $A \subset \mathbb{X}$ is small if there is an integer $n \ge 1$ and a positive measure $\mu$ satisfying $\mu(\mathbb{X}) > 0$ and $P^n(x, B) \ge \mu(B)$, $\forall x \in A, B \in \mathcal{B}(\mathbb{X})$.

Definition 5.7: A set $A \subset \mathbb{X}$ is $\zeta$-petite on $(\mathbb{X}, \mathcal{B}(\mathbb{X}))$ if for some distribution $Z$ on $\mathbb{N}$ (set of natural numbers), and some non-trivial measure $\zeta$, $\sum_{n=0}^{\infty} P^n(x, B) Z(n) \ge \zeta(B)$, $\forall x \in A, B \in \mathcal{B}(\mathbb{X})$.
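These definitions can be illustrated on a finite state space, where measures are vectors and the transition kernel is a row-stochastic matrix. The three-state chain below is a hypothetical example, not taken from the paper; it is used to compute an invariant distribution by repeated application of the kernel, to check irreducibility through the two-stage kernel, and to exhibit a minorizing measure showing that the whole state space is small.

```python
def step_distribution(pi, P):
    """One application of the kernel: pi -> pi P for a row-stochastic matrix P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


def invariant_distribution(P, iterations=200):
    """Approximate the invariant distribution by iterating from a point mass."""
    pi = [1.0] + [0.0] * (len(P) - 1)
    for _ in range(iterations):
        pi = step_distribution(pi, P)
    return pi


def n_step(P, n):
    """Transition matrix in n stages, P^n."""
    size = len(P)
    result = P
    for _ in range(n - 1):
        result = [[sum(result[i][k] * P[k][j] for k in range(size)) for j in range(size)]
                  for i in range(size)]
    return result


P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
pi = invariant_distribution(P)
P2 = n_step(P, 2)
# Column-wise minimum of P^2: a measure mu with P^2(x, B) >= mu(B) for every state x.
minorizing = [min(P2[i][j] for i in range(3)) for j in range(3)]
```

Every entry of the two-stage kernel is positive, so the chain is irreducible, and the minorizing measure has total mass $0.75 > 0$, so the whole state space is a small set with $n = 2$.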

In the following, let $\mathcal{F}_t$ denote the filtration generated by the random sequence $\{\phi_{[0,t]}\}$. Define a sequence of stopping times $\{\tau_i : i \in \mathbb{N}_+\}$, measurable on the filtration described above, which is assumed to be non-decreasing, with $\tau_0 = 0$.

Theorem 5.8: [18] Suppose that we have a $\varphi$-irreducible Markov chain $\phi$. Suppose moreover that there are functions $V : \mathbb{X} \to [a, \infty)$, $\beta : \mathbb{X} \to [a, \infty)$, $f : \mathbb{X} \to [a, \infty)$, for some $a \ge 0$, small set $C$, constant $b \in \mathbb{R}$ and consider:

$$E[V(\phi_{\tau_{z+1}}) \mid \mathcal{F}_{\tau_z}] \le V(\phi_{\tau_z}) - \beta(\phi_{\tau_z}) + b 1_{\{\phi_{\tau_z} \in C\}}, \qquad (33)$$

$$E\Big[\sum_{k=\tau_z}^{\tau_{z+1}-1} f(\phi_k) \,\Big|\, \mathcal{F}_{\tau_z}\Big] \le \beta(\phi_{\tau_z}), \quad z \ge 0. \qquad (34)$$

If $a = 1$ and (33) holds then $\phi$ is positive Harris recurrent with some unique invariant distribution $\pi$. If $a = 0$, (33), (34) hold and $\phi$ is positive Harris recurrent with some unique invariant distribution $\pi$ then we get that $\lim_{t \to \infty} E[f(\phi_t)] < \infty$.